CLAIMS: 

1. A method for normalizing input strings, the method comprising the 
steps of: 

(a) receiving the input strings; 

(b) linguistically analyzing the input strings to generate a first 
representation of each of the input strings; each of the first representations 
including linguistic information; 

(c) skeletising each of the first representations to generate a 
corresponding second representation for each of the input strings; said 
skeletising step replacing the linguistic information with abstract variables in 
each of the second representations; and 

(d) storing the second representation as normalized representations of 
the input strings. 
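
Steps (a) through (d) can be illustrated with a minimal sketch. It assumes that the linguistic analysis is a lookup in a toy lexicon and that each abstract variable is a tag naming the syntactic category it replaces. The lexicon, the tag names, and the functions `analyze`, `skeletise`, and `normalize` are hypothetical and are not taken from the claims.

```python
# Illustrative sketch only: the toy lexicon, the tag names, and the function
# names are assumptions made for this example, not part of the claims.
from dataclasses import dataclass

# Hypothetical mini-lexicon: surface form -> (base form, syntactic category)
LEXICON = {
    "the": ("the", "DET"),
    "printer": ("printer", "NOUN"),
    "printers": ("printer", "NOUN"),
    "scanner": ("scanner", "NOUN"),
    "prints": ("print", "VERB"),
    "printed": ("print", "VERB"),
    "page": ("page", "NOUN"),
    "pages": ("page", "NOUN"),
}


@dataclass(frozen=True)
class Token:
    surface: str   # word as it appears in the input string
    base: str      # base form from the (assumed) morphological analysis
    category: str  # syntactic category from the (assumed) lexicon


def analyze(text):
    """Step (b): build a first representation carrying linguistic information."""
    tokens = []
    for word in text.lower().split():
        base, category = LEXICON.get(word, (word, "UNK"))
        tokens.append(Token(word, base, category))
    return tokens


def skeletise(tokens):
    """Step (c): replace the linguistic information with abstract variables."""
    return tuple(f"<{t.category}>" for t in tokens)


def normalize(strings):
    """Steps (a)-(d): receive strings, analyze, skeletise, store the skeletons."""
    store = {}
    for s in strings:
        store.setdefault(skeletise(analyze(s)), []).append(s)
    return store


store = normalize(["The printer prints pages", "The scanner printed page"])
```

In this sketch both input strings reduce to the same skeleton, so they are stored under a single normalized representation.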

2. The method of claim 1, wherein said step of linguistically analyzing 
comprises performing a plurality of operating functions. 

3. The method of claim 2, wherein said plurality of operating functions 
comprise performing one of morphological analysis, syntactic analysis, and 
semantic analysis. 

4. The method of claim 3, wherein said step of linguistically analyzing 
comprises normalizing words according to their base forms. 

5. The method of claim 3, wherein said analysis further comprises the step 
of extracting a syntactic category for individual words. 

6. The method of claim 3, wherein said analysis further comprises the step 
of extracting syntactic information representing string structure. 

7. The method of claim 3, wherein said analysis further comprises the step 
of extracting dependency relations between sub-structures of a string. 

8. The method of claim 3, wherein said analysis further comprises 
providing semantic links for individual words. 
9. The method of claim 2, further comprising the step of performing 
machine learning for selecting particular operating functions out of said plurality 
of operating functions and for determining the processing order. 

9. The method of claim 2, wherein said storing further comprises storing 
the operating functions performed on the normalized representations. 

The method of claim 1, wherein the abstract variables are tags 
indicating the replaced linguistic information. 

The method of claim 1, wherein the normalized representations are 
stored in a database. 



The method of claim further comprising: 
receiving a query; 

generating a normalized representation of said query by performing steps 
(b) and (c); 

matching the normalized representation of said query to the normalized 
representations stored in the database; and 

retrieving from said database strings identified by said matching step. 
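
The query steps above can be sketched as follows, assuming exact matching on the normalized representation. The category table, the `normalized` helper (a stand-in for steps (b) and (c)), and the `SkeletonDatabase` class are hypothetical names introduced only for illustration.

```python
# Illustrative sketch only: the category table and all names are assumptions.
CATEGORY = {
    "open": "VERB",
    "close": "VERB",
    "the": "DET",
    "file": "NOUN",
    "window": "NOUN",
}


def normalized(text):
    """Stand-in for steps (b) and (c): map each word to a category tag."""
    return tuple(CATEGORY.get(w, "UNK") for w in text.lower().split())


class SkeletonDatabase:
    """In-memory stand-in for the database of normalized representations."""

    def __init__(self):
        self._index = {}  # normalized representation -> original strings

    def add(self, text):
        self._index.setdefault(normalized(text), []).append(text)

    def query(self, text):
        """Normalize the query, match it, and retrieve the stored strings."""
        return list(self._index.get(normalized(text), []))


db = SkeletonDatabase()
db.add("Open the file")
db.add("Close the window")
db.add("The file")
matches = db.query("close the file")
```

Here the query shares no complete wording with either stored sentence, yet both three-word sentences are retrieved because their normalized representations coincide with that of the query.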



The method of claim 1, wherein said steps (a) - (d) are performed to 
generate a translation memory comprising a plurality of normalized 
representations of strings in a first language and a second language. 

The method of claim further comprising the steps of: 

receiving an input string in the first language; 

retrieving a similar string in said first language from said plurality of 
normalized representations, and 

outputting said translation information based on a string in said second 
language which corresponds to said retrieved string in said first language. 
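
A translation-memory lookup of this kind might be sketched as below. The sketch assumes that similarity is measured between normalized representations using the standard-library `difflib.SequenceMatcher`; the claims do not specify a similarity measure. The category table, the sample sentence pairs, and the `TranslationMemory` class are hypothetical.

```python
# Illustrative sketch only: the similarity measure, the toy bilingual data,
# and all names are assumptions made for this example.
from difflib import SequenceMatcher

CATEGORY = {
    "press": "VERB",
    "hold": "VERB",
    "the": "DET",
    "red": "ADJ",
    "button": "NOUN",
    "key": "NOUN",
}


def skeleton(text):
    """Stand-in for analysis plus skeletising: one category tag per word."""
    return tuple(CATEGORY.get(w, "UNK") for w in text.lower().split())


class TranslationMemory:
    def __init__(self):
        # Each entry: (skeleton, first-language string, second-language string)
        self._entries = []

    def add(self, source, target):
        self._entries.append((skeleton(source), source, target))

    def lookup(self, text):
        """Return (source, target) for the most similar stored skeleton."""
        if not self._entries:
            return None
        query = skeleton(text)
        best = max(
            self._entries,
            key=lambda e: SequenceMatcher(None, query, e[0]).ratio(),
        )
        return best[1], best[2]


tm = TranslationMemory()
tm.add("Press the button", "Appuyez sur le bouton")
tm.add("Press the red button", "Appuyez sur le bouton rouge")
result = tm.lookup("Hold the key")
```

Because the comparison is made on skeletons only, the retrieved pair is a structurally similar example from which translation information can be derived, not a finished translation of the input string.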

An apparatus for normalizing input strings, the apparatus comprising: 

a text processing unit for: 

receiving the input strings, 
linguistically analyzing the input strings to generate a first 
representation of each of the input strings; each of the first 
representations including linguistic information, and 

skeletising each of the first representations to generate a 
corresponding second representation for each of the input strings; said 
skeletising replacing the linguistic information with abstract variables in 
each of the second representations; and 

memory for storing the second representation as normalized 
representations of the input strings. 

The apparatus of claim further comprising a query formatting unit for: 

receiving queries; 

linguistically analyzing the queries to generate a first representation of 
each of the queries; each of the first representations including linguistic 
information; and 

skeletising each of the first representations to generate a corresponding 
second representation for each of the queries; said skeletising replacing the 
linguistic information with abstract variables in each of the second 
representations. 

The apparatus of claim further comprising: 

memory for storing the second representation of the queries as normalized 
representations of the queries; 

a matching unit for matching the normalized representations of the input 
strings with the normalized representation of the queries. 

The apparatus of claim further comprising a translation memory for 
storing translations of the input strings. 